566 research outputs found

Recovering non-local dependencies for Chinese

Author: Guo Yuqing
van Genabith Josef
Wang Haifeng
Publication venue: 'Association for Computational Linguistics (ACL)'
Publication date: 01/01/2007

To date, work on Non-Local Dependencies (NLDs) has focused almost exclusively on English, and it is an open research question how well these approaches migrate to other languages. This paper surveys non-local dependency constructions in Chinese as represented in the Penn Chinese Treebank (CTB) and provides an approach for generating proper predicate-argument-modifier structures including NLDs from surface context-free phrase structure trees. Our approach recovers non-local dependencies at the level of Lexical-Functional Grammar f-structures, using automatically acquired subcategorisation frames and f-structure paths linking antecedents and traces in NLDs. Currently our algorithm achieves 92.2% f-score for trace insertion and 84.3% for antecedent recovery evaluating on gold-standard CTB trees, and 64.7% and 54.7%, respectively, on CTB-trained state-of-the-art parser output trees.

Irish Universities

DCU Online Research Access Service

Treebank-based acquisition of Chinese LFG resources for parsing and generation

Author: Guo Yuqing
Publication venue: Dublin City University. School of Computing
Publication date: 01/11/2009

This thesis describes a treebank-based approach to automatically acquire robust, wide-coverage Lexical-Functional Grammar (LFG) resources for Chinese parsing and generation, which is part of a larger project on the rapid construction of deep, large-scale, constraint-based, multilingual grammatical resources. I present an application-oriented LFG analysis for Chinese core linguistic phenomena and (in cooperation with PARC) develop a gold-standard dependency-bank of Chinese f-structures for evaluation. Based on the Penn Chinese Treebank, I design and implement two architectures for inducing Chinese LFG resources, one annotation-based and the other dependency conversion-based. I then apply the f-structure acquisition algorithm together with external, state-of-the-art parsers to parsing new text into "proto" f-structures. In order to convert "proto" f-structures into "proper" f-structures or deep dependencies, I present a novel Non-Local Dependency (NLD) recovery algorithm using subcategorisation frames and f-structure paths linking antecedents and traces in NLDs extracted from the automatically-built LFG f-structure treebank. Based on the grammars extracted from the f-structure annotated treebank, I develop a PCFG-based chart generator and a new n-gram based pure dependency generator to realise Chinese sentences from LFG f-structures. The work reported in this thesis is the first effort to scale treebank-based, probabilistic Chinese LFG resources from proof-of-concept research to unrestricted, real text. Although this thesis concentrates on Chinese and LFG, many of the methodologies, e.g. the acquisition of predicate-argument structures, NLD resolution and the PCFG- and dependency n-gram-based generation models, are largely language and formalism independent and should generalise to diverse languages as well as to labelled bilexical dependency representations other than LFG.

Irish Universities

DCU Online Research Access Service

Treebank-based acquisition of LFG resources for Chinese

Author: Guo Yuqing
van Genabith Josef
Wang Haifeng
Publication venue: CSLI Publications
Publication date: 01/01/2007

This paper presents a method to automatically acquire wide-coverage, robust, probabilistic Lexical-Functional Grammar resources for Chinese from the Penn Chinese Treebank (CTB). Our starting point is the earlier, proof-of-concept work of Burke et al. (2004) on automatic f-structure annotation, LFG grammar acquisition and parsing for Chinese using the CTB version 2 (CTB2). We substantially extend and improve on this earlier research as regards coverage, robustness, quality and fine-grainedness of the resulting LFG resources. We achieve this through (i) improved LFG analyses for a number of core Chinese phenomena; (ii) a new automatic f-structure annotation architecture which involves an intermediate dependency representation; (iii) scaling the approach from 4.1K trees in CTB2 to 18.8K trees in CTB version 5.1 (CTB5.1); and (iv) developing a novel treebank-based approach to recovering non-local dependencies (NLDs) for Chinese parser output. Against a new 200-sentence gold standard of manually constructed f-structures, the method achieves 96.00% f-score for f-structures automatically generated for the original CTB trees and 80.01% for NLD-recovered f-structures generated for the trees output by Bikel’s parser.

Irish Universities

DCU Online Research Access Service

An analysis of question processing of English and Chinese for the NTCIR 5 cross-language question answering task

Author: Guo Yuqing
Jones Gareth J.F.
Judge John
Wang Bin
Publication venue: 'National Institute of Informatics (NII)'
Publication date: 01/01/2005

An important element in question answering systems is the analysis and interpretation of questions. Using the NTCIR 5 Cross-Language Question Answering (CLQA) question test set we demonstrate that the accuracy of deep question analysis is dependent on the quantity and suitability of the available linguistic resources. We further demonstrate that applying question analysis tools developed on monolingual training materials to questions translated Chinese-English and English-Chinese using machine translation produces much reduced effectiveness in interpretation of the question. This latter result indicates that question analysis for CLQA should primarily be conducted in the question language prior to translation

CiteSeerX

Irish Universities

DCU Online Research Access Service

ACTS in Need: Automatic Configuration Tuning with Scalability Guarantees

Author: Bao Yungang
Guo Mengying
Liu Jianxun
Ma Wenlong
Zhu Yuqing
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 07/08/2017

To support the variety of Big Data use cases, many Big Data related systems expose a large number of user-specifiable configuration parameters. As our experiments highlight, a MySQL deployment with well-tuned configuration parameters achieves a peak throughput 12 times that of one with the default setting. However, finding the best setting for the tens or hundreds of configuration parameters is all but impossible for ordinary users. Worse still, many Big Data applications require the support of multiple systems co-deployed in the same cluster. As these co-deployed systems can interact to affect the overall performance, they must be tuned together. Automatic configuration tuning with scalability guarantees (ACTS) is needed to help system users. Solutions to ACTS must scale to various systems, workloads, deployments, parameters and resource limits. Proposing and implementing an ACTS solution, we demonstrate that ACTS can benefit users not only in improving system performance and resource utilization, but also in saving costs and enabling fairer benchmarking.

arXiv.org e-Print Archive

Crossref

Poly(ethylene glycol)-conjugated surfactants promote or inhibit aggregation of phospholipids

Author: Guo Yuqing
Hui Sek Wen
Publication venue: Elsevier Science B.V.
Publication date: 31/01/1997

The calcium-induced aggregation of dilauroyl phosphatidic acid (DLPA) suspensions, with or without added poly(ethylene oxide) (PEO)-conjugated surfactants containing 4 to 30 ethylene oxide subunits, was monitored by turbidity measurement and quasi-elastic light scattering (QLS). The aggregation was inhibited (protected) by the incorporated PEO surfactant for most samples, while a window for promotive effect was found for samples with low surface coverage by the PEO moiety of the incorporated surfactant. Promotion occurs only when the aggregation is slow and at a low level. The promotion is explained by the synergistic effect of PEO and divalent calcium cations when the steric repulsion is weak. The promotion/protection crossover is a display between the PEO/calcium synergistic effect and the steric repulsion.

Elsevier - Publisher Connector

Live and let die: asymmetric dimethylarginine and septic shock

Author: Guo Yuqing
van Genabith Josef
Wang Haifeng
Publication venue: BioMed Central
Publication date: 03/11/2006

Nitric oxide (NO) is an important mediator of host defence and of vascular tone. In septic shock, upregulation of inducible NO synthase leads to the production of vast amounts of NO, which contribute to pathogen elimination but also to inappropriate vasodilation and to loss of vascular resistance. Asymmetric dimethylarginine (ADMA) is an endogenous inhibitor of NO synthases shown to contribute to the regulation of vascular tone. ADMA was recently identified as a marker of organ dysfunction and mortality in intensive care patients and as a novel cardiovascular risk factor. In the present issue of Critical Care, a study by O'Dwyer and colleagues identifies ADMA as a potential regulator of NO production in septic shock. Being an inhibitor of NO production, ADMA may at least partly counteract pathological hypotension, but at the same time may impair the NO-dependent host defence. A mechanism is proposed by which the interplay between ADMA and inducible NO synthase activity is mediated. ADMA levels should be determined in future studies evaluating the regulation of NO in the intensive care setting

Crossref

Irish Universities

PubMed Central

DCU Online Research Access Service

BestConfig: Tapping the Performance Potential of Systems via Automatic Configuration Tuning

Author: Bao Yungang
Guo Mengying
Liu Jianxun
Liu Zhuoyue
Ma Wenlong
Song Kunpeng
Yang Yingchun
Zhu Yuqing
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 10/10/2017

An ever-increasing number of configuration parameters are provided to system users. But many users use a single configuration setting across different workloads, leaving the performance potential of systems untapped. A good configuration setting can greatly improve the performance of a deployed system under certain workloads. But with tens or hundreds of parameters, deciding which configuration setting leads to the best performance becomes a highly costly task. While such a task requires strong expertise in both the system and the application, users commonly lack such expertise. To help users tap the performance potential of systems, we present BestConfig, a system for automatically finding a best configuration setting within a resource limit for a deployed system under a given application workload. BestConfig is designed with an extensible architecture to automate the configuration tuning for general systems. To tune system configurations within a resource limit, we propose the divide-and-diverge sampling method and the recursive bound-and-search algorithm. BestConfig can improve the throughput of Tomcat by 75%, that of Cassandra by 63%, that of MySQL by 430%, and reduce the running time of a Hive join job by about 50% and that of a Spark join job by about 80%, solely by configuration adjustment.

arXiv.org e-Print Archive

Crossref